Distributed Bayesian Probabilistic Matrix Factorization
Matrix factorization is a common machine learning technique for recommender
systems. Despite its high prediction accuracy, the Bayesian Probabilistic
Matrix Factorization (BPMF) algorithm has not been widely used on large-scale
data because of its high computational cost. In this paper we propose a
distributed, high-performance parallel implementation of BPMF for shared-memory
and distributed architectures. We show that by using efficient load balancing
based on work stealing on a single node, and asynchronous communication in the
distributed version, we outperform state-of-the-art implementations.
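The load-balancing idea can be illustrated with a minimal work-stealing sketch: each worker owns a deque, pops tasks from its own tail, and steals from the head of another worker's deque when it runs dry. This is an illustrative single-threaded simulation under assumed names (`work_steal_run`, `work_fn`), not the paper's implementation.

```python
import random
from collections import deque

def work_steal_run(task_lists, work_fn):
    """Simulate work stealing: per-worker deques, local LIFO pops,
    FIFO steals from a random non-empty victim when a worker is idle."""
    deques = [deque(tasks) for tasks in task_lists]
    results = []
    # Round-robin simulation of the workers on a single thread.
    while any(deques):
        for dq in deques:
            if dq:
                results.append(work_fn(dq.pop()))  # local pop (tail, LIFO)
            else:
                victims = [d for d in deques if d]
                if victims:
                    # Steal from the head (FIFO) of another worker's deque.
                    results.append(work_fn(random.choice(victims).popleft()))
    return results
```

Popping locally from the tail and stealing from the head is the classic design choice: it keeps local work cache-warm while steals grab the oldest (often largest) tasks.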
A GPU-accelerated Branch-and-Bound Algorithm for the Flow-Shop Scheduling Problem
Branch-and-Bound (B&B) algorithms are time-intensive tree-based exploration
methods for solving combinatorial optimization problems to optimality. In this
paper, we investigate the use of GPU computing as a major complementary way to
speed up these methods. The focus is put on the bounding mechanism of B&B
algorithms, which is the most time-consuming part of their exploration process.
We propose a parallel B&B algorithm based on a GPU-accelerated bounding model.
The proposed approach concentrates on optimizing data access management to
further improve the performance of the bounding mechanism, which uses large
intermediate data sets that do not completely fit in GPU memory. Extensive
experiments have been carried out on well-known FSP benchmarks using an Nvidia
Tesla C2050 GPU card. We compared the obtained performance to single-threaded
and multithreaded CPU-based executions. Accelerations of up to ×100 are
achieved for large problem instances.
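The pool-based bounding model described above can be sketched as a toy B&B in which the lower bounds of a whole pool of sub-problems are evaluated in one batched call (standing in for the GPU kernel). The toy problem, bound, and function names here are illustrative assumptions, not the paper's flow-shop lower bound.

```python
def batched_bounds(pool, bound_fn):
    # In the paper this batch is offloaded to the GPU; a plain loop
    # stands in here for the parallel bounding kernel.
    return [bound_fn(node) for node in pool]

def bb_min_permutation(costs):
    """Tiny B&B over permutations of range(n). A node is a partial
    permutation; its cost is sum(costs[pos][job]); the bound is the
    cost of the fixed prefix, admissible since costs are non-negative."""
    n = len(costs)
    best_val, best_perm = float("inf"), None
    pool = [()]
    while pool:
        bounds = batched_bounds(
            pool, lambda p: sum(costs[i][j] for i, j in enumerate(p)))
        next_pool = []
        for node, lb in zip(pool, bounds):
            if lb >= best_val:
                continue  # prune: bound cannot improve on the incumbent
            if len(node) == n:
                best_val, best_perm = lb, node  # complete: new incumbent
            else:
                next_pool += [node + (j,) for j in range(n) if j not in node]
        pool = next_pool
    return best_val, best_perm
```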
An Adaptative Multi-GPU based Branch-and-Bound. A Case Study: the Flow-Shop Scheduling Problem
Solving Combinatorial Optimization Problems (COPs) exactly using a
Branch-and-Bound (B&B) algorithm requires a huge amount of computational
resources. Therefore, we recently investigated designing B&B algorithms on top
of graphics processing units (GPUs) using a parallel bounding model. The
proposed model parallelizes the evaluation of the lower bounds over pools of
sub-problems. The results demonstrated that the size of the evaluated pool has
a significant impact on the performance of B&B and that it depends strongly on
the problem instance being solved. In this paper, we design an adaptive
parallel B&B algorithm for solving permutation-based combinatorial
optimization problems such as the Flow-shop Scheduling Problem (FSP) on GPU
accelerators. To do so, we propose a dynamic heuristic for parameter
auto-tuning at runtime. Another challenge of this work is to exploit larger
degrees of parallelism by using the combined computational power of multiple
GPU devices. The approach has been applied to the permutation flow-shop
problem. Extensive experiments have been carried out on well-known FSP
benchmarks using an Nvidia Tesla S1070 Computing System equipped with two Tesla
T10 GPUs. Compared to a CPU-based execution, accelerations of up to ×105 are
achieved for large problem instances.
Comment: 14th IEEE International Conference on High Performance Computing and
Communications, HPCC 2012 (2012)
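One plausible shape for such a runtime auto-tuning heuristic is hill climbing on the pool size: grow the pool while the measured bounding throughput keeps improving, then stop. This is a hypothetical sketch; `measure_throughput` is an assumed callback wrapping a timed bounding run, not an interface from the paper.

```python
def tune_pool_size(measure_throughput, start=128, factor=2, max_size=1 << 20):
    """Hill-climbing auto-tuner (illustrative): double the pool size
    while throughput (e.g. nodes bounded per second) improves, then
    keep the last size that improved it."""
    size, best = start, measure_throughput(start)
    while size * factor <= max_size:
        cand = measure_throughput(size * factor)
        if cand <= best:
            break  # throughput stopped improving; keep current size
        size, best = size * factor, cand
    return size
```

Because the best pool size depends on the instance being solved, the tuner would be re-run (or kept running) per instance rather than fixed offline.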
Reducing Thread Divergence in GPU-based B&B Applied to the Flow-shop problem
In this paper, we propose a pioneering work on designing and programming B&B algorithms on GPU. To the best of our knowledge, no contribution has been proposed to address such a challenge. We focus on the parallel evaluation of the bounds for the Flow-shop scheduling problem. To deal with thread divergence caused by the bounding operation, we investigate two software-based approaches called thread data reordering and branch refactoring. Experiments showed that parallel evaluation of the bounds speeds up execution by up to 54.5× compared to a CPU version.
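The branch-refactoring idea can be shown on a minimal example: replace a data-dependent if/else (which makes SIMD threads in a warp diverge) with branch-free arithmetic so every thread executes the same instruction sequence. A sketch of the general technique, not the paper's bounding code.

```python
def clamp_divergent(x, lo, hi):
    # Divergent form: threads taking different branches serialize
    # within a warp under the SIMD execution model.
    if x < lo:
        return lo
    elif x > hi:
        return hi
    return x

def clamp_refactored(x, lo, hi):
    # Branch-refactored form: the same result via min/max, so all
    # threads follow one identical instruction path.
    return max(lo, min(x, hi))
```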
Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm
In this paper, we address the design and implementation of GPU-accelerated Branch-and-Bound (B&B) algorithms for solving Flow-shop scheduling optimization problems (FSP). Such applications are CPU-time consuming and highly irregular. On the other hand, GPUs are massively multi-threaded accelerators using the SIMD execution model. A major issue which arises when executing a B&B applied to FSP on a GPU is thread (branch) divergence. Such divergence is caused by the lower-bound function of FSP, which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP to deal with thread divergence. Extensive experiments of the proposed approach have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, accelerations of up to ×77.46 are achieved for large problem instances.
Parallel optimization using / for multi- and multi-core high-performance computing
No abstract available.
Large-scale wearable data reveal digital phenotypes for daily-life stress detection
Physiological signals have been shown to be reliable indicators of stress in laboratory studies, yet large-scale ambulatory validation is lacking. We present a large-scale cross-sectional study of ambulatory stress detection, comprising 1002 subjects and containing subjects' demographics, baseline psychological information, and five consecutive days of free-living physiological and contextual measurements collected through wearable devices and smartphones. This dataset represents a healthy population, showing associations between wearable physiological signals and self-reported daily-life stress. Using a data-driven approach, we identified digital phenotypes, characterized by self-reported poor health indicators and high depression, anxiety and stress scores, that are associated with blunted physiological responses to stress. These results emphasize the need for large-scale collections of multi-sensor data to build personalized stress models for precision medicine.